02. Overview of biases
What is biased data?
The data you work with is flawed, whether you like it or not. It is rarely fully representative of the population you are trying to draw conclusions about. Take a dataset of Black Friday shoppers’ demographics. Where does this data come from? Depending on the problem you are solving, do the users in this dataset cover the population you are making recommendations for? It’s important to consider where bias can be introduced during collection, processing, and analysis, and to call out these caveats at the recommendation stage.
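One rough way to check representativeness is to compare each group's share of the dataset with its share of the population you care about. The sketch below is only an illustration: the age bands, the counts, the population shares, the 5-point tolerance, and the helper names `representation_gaps` and `flag_gaps` are all assumptions made up for this example, not figures or methods from a real Black Friday dataset.

```python
# Hypothetical figures for illustration only; not real Black Friday data.
sample_counts = {"18-29": 450, "30-44": 270, "45-64": 200, "65+": 80}
# Assumed share of each age band in the population we want to describe.
population_share = {"18-29": 0.20, "30-44": 0.25, "45-64": 0.35, "65+": 0.20}


def representation_gaps(sample_counts, population_share):
    """Return (sample share - population share) for each group."""
    total = sum(sample_counts.values())
    return {
        group: sample_counts.get(group, 0) / total - share
        for group, share in population_share.items()
    }


def flag_gaps(gaps, tolerance=0.05):
    """List groups whose sample share is off by more than the tolerance."""
    return sorted(group for group, gap in gaps.items() if abs(gap) > tolerance)


gaps = representation_gaps(sample_counts, population_share)
for group, gap in gaps.items():
    # Positive means over-represented in the sample, negative under-represented.
    print(f"{group}: {gap:+.2f}")
print("Groups to caveat:", flag_gaps(gaps))
```

A flagged group does not by itself prove the data is unusable. It marks a caveat worth stating when you present recommendations, and the tolerance should be chosen to suit the problem at hand.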
In this lesson we will go in depth on the types of bias that can appear in data at various stages. We will cover how to identify each type of bias and how to address it when presenting to important stakeholders. This is a critical piece of the ghost deck, and an important responsibility for anyone using data to answer questions. The lesson is structured as follows.
Overall Structure of Biases:
- Data Collection
● Selection Bias
● Response Bias
● Missing Variables
● Survivorship Bias
- Data Processing
● Outliers
● Distribution Understanding
● Missingness Understanding
- Data Insights
● Confirmation Bias
● Overfitting/Underfitting
● Confounding Variables